I  INTRODUCTION


The central theme of our research is the recovery of information
about the three-dimensional structure and physical characteristics of
surfaces depicted in an image: their shapes, locations, and
photometric properties. The main obstacle to surface recovery is the
confounding of the desired properties in the sensory data: images are
inherently ambiguous. Our approach to resolving this ambiguity rests on
the application of generic, low-level knowledge (e.g., such basic
assumptions as surface continuity and general position) to constrain the
interpretation. The problem may be viewed as that of decomposing the
image into its physically meaningful constituents: surface
orientation, reflectance, illumination, and so on. The "intrinsic image
model" provides a conceptual and computational framework in which this
view is made explicit.

Surface perception plays a fundamental role in early visual
processing, both in humans and in machines. An explicit representation
of surface structure is necessary for many low-level visual functions
involved in such applications as terrain modeling, remote sensing,
navigation, manipulation, and obstacle avoidance. It is also a
prerequisite for general-purpose vision systems capable of human-level
performance in such tasks as object recognition and scene description.

Work on surface perception has focused on the discrimination of
edge types (e.g., extremal boundary or cast shadow), on the
three-dimensional interpretation of edges, and on surface reconstruction
by interpolating from edges and using texture geometry.
II  RESEARCH ACCOMPLISHMENTS


Much of our earlier work on three-dimensional interpretation of
edges and texture assumed a capability to discriminate edges of distinct
physical types: extremal edges, shadow boundaries, discontinuities of
surface orientation, and reflectance edges. Each edge type imposes
distinct constraints on surface recovery, but these constraints cannot
be exploited unless edges can be reliably classified. Existing
edge-classification techniques based on junction catalogues and
constraint propagation depend critically on ideal data, and are
therefore inadequate for natural imagery. For these reasons, our work
focused on developing new edge-classification techniques that could be
applied to natural imagery. As a result of this effort, we developed and
implemented a new intensity-based approach to edge classification.
Using basic properties of scenes and images, we deduced signatures for
several edge types that are expressed in terms of correlational
properties of the image intensities in the neighborhood of the edge, and
developed a computer program that evaluates image edges against these
prototype signatures. The program effectively discriminates extremal
boundaries from cast shadow boundaries in cases where traditional
junction cues are absent from the image.

Reports of our previous work on edge reconstruction, surface
interpolation, and shape recovery from texture have been published in
professional journals [Ref 1-7]; reprints of these papers are available
on request.

A.  Edge Classification

Edges play a central role in three-dimensional surface
reconstruction. Crucial to exploiting the constraints imposed by edges
is edge sorting: classifying the image edges according to the type of
surface boundary they represent (e.g., extremal boundaries, shadow
edges, surface orientation discontinuities, or texture edges). Because
each edge type imposes different constraints on three-dimensional
interpretation, misclassification can lead to serious interpretation
errors. Edge classification in line drawings has been addressed in
terms of propagation of junction constraints, global structural cues
such as parallelism and symmetry, and global optimization criteria on
the three-dimensional interpretation. Because these techniques depend
on perfect edge data, their applicability to natural imagery is
questionable.

An alternative approach to edge sorting is to use intensity and
spectral information in the neighborhood of the edge. Horn [8]
suggested that the intensity profiles across edges (such as peak versus
step) could provide signatures for some edge types. However, this
technique has not worked for complex imagery.

In this section we describe an intensity-based, line-sorting
technique that distinguishes line types by statistically comparing
intensity variations along opposite sides of the edge. We have focused
on two line types (extremal edges and cast shadow boundaries), but
extensions to other edge types have also been explored.


B.  Defining the Problem

Because line types are defined in terms of the scene events they
denote, any method for line sorting must provide some basis for
discriminating those events by their appearance in the image. We
therefore begin by characterizing the distinctive properties of extremal
boundaries and cast shadow edges, and defining the computational problem
of identifying those edges.


Extremal Boundaries: Projective mapping from image to scene
tends to be continuous because physical surfaces tend to
be continuous. Almost everywhere in a typical image,
therefore, nearby points in the image correspond to
nearby points in the scene. This adjacency is preserved
over any change in point of view or scene configuration,
[...] scene is composed. The distinguishing property of
extremal boundaries (which can be defined as
discontinuities in the projective mapping) is their
systematic violation of this rule: the apparent
juxtaposition of two surfaces across an extremal edge
represents no fixed property of either surface, but is
subject to the vagaries of viewpoint and scene
configuration. For example, if you position your finger
to coincide with a particular feature on the wall or
outside the window, a small change in the position of
head or hand may drastically affect the apparent
relationship. Because the false appearance of proximity
is the hallmark of extremal edges, the problem in
identifying those edges is to distinguish in the image
the actual proximity of nearby points on connected
surfaces from accidental proximity imposed by projection.

Cast Shadows: Cast shadows in outdoor scenes represent
transitions from direct to scattered illumination caused
by the interposition of an occluding body between the sun
and the viewed surface. The problem in identifying cast
shadows is to distinguish these transitions in incident
illumination from changes in albedo or surface
orientation, for example. This kind of discrimination
presents a problem because the effects of all these
parameters are confounded in the image data: a change in
image brightness may reflect a change in albedo or
surface orientation, as well as in incident illumination.
Because the relation among illumination, reflectivity,
orientation, and image irradiance is well known, the
presence of shadows in an image could be readily detected
if a constant planar reference pattern could be placed in
the scene; when the apparent brightness of a constant
pattern varies with location, the change in brightness
must, by elimination, be attributed to a change in
illumination. Of course, such active intervention is
generally impractical; the problem is to achieve the
effect of viewing a constant pattern across the shadow
edge without actually placing such a pattern in the
scene. This effect could be achieved if some fixed
relationship were known to hold between the surface
strips on each side of the shadow edge.

In short, extremal boundaries are curves across which distant
points in space are placed in apparent juxtaposition by projection,
violating the continuity of the projective mapping that holds over most
of the image. To identify extremal boundaries requires, therefore, that
actual proximity be distinguished from apparent proximity imposed by
projection. Cast shadow edges are contours across which the pattern of
surface reflectance has been systematically transformed by an abrupt
change in illumination. To identify cast shadow edges, the effects of
illumination must be distinguished from those of albedo and surface
orientation, as if a constant planar reference pattern had been placed
across the edge.

C.  Computational Theory

Our solution rests on the simple principle that coherence in the
image reflects real coherence in the scene, rather than a coincidence of
the structure and alignment of distinct scene constituents. We measure
coherence in the neighborhood of an edge by performing a normalized
correlation on intensity values at corresponding points across the edge.
(Other measures of coherence are possible, such as continuity of linear
structure.)
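As an illustration only, the coherence measure described above can be sketched as an ordinary Pearson correlation between intensity samples taken at corresponding points on the two sides of an edge. The function name `normalized_correlation` and the sample intensity values are assumptions made for this sketch; they are not taken from the program described in this report.

```python
import math

def normalized_correlation(a, b):
    """Pearson correlation of two equal-length intensity sequences."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("need two equal-length sequences of at least 2 samples")
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        # A constant strip carries no structure; treat it as uncorrelated.
        return 0.0
    return cov / math.sqrt(var_a * var_b)

# Hypothetical intensity samples along one side of an edge.
left = [10, 40, 20, 60, 30]
# Same surface pattern seen under dimmer illumination (linear change).
shadowed = [0.5 * v + 5 for v in left]
# Samples from an unrelated surface on the far side of an occlusion.
unrelated = [33, 12, 58, 21, 47]

r_same_surface = normalized_correlation(left, shadowed)
r_distinct_surfaces = normalized_correlation(left, unrelated)
```

Because the correlation is normalized, a purely linear change in brightness leaves it high, which is why a shadow on a connected surface does not by itself lower the coherence measure.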

A high correlation implies that the edge and its neighborhood
correspond to a strip on a connected surface. Therefore, the edge is
not an extremal boundary, and furthermore, the regions on either side
can be regarded as instances of a (statistically) constant pattern. In
that case, the presence of a shadow can be detected by constructing a
regression equation whose parameters signal any systematic photometric
distortion of the pattern across the edge. Ideally, this distortion is
linear, but nonlinearities are introduced in practice by complex lighting
effects, film or sensor response, and so forth.
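Under the idealized linear model, the regression can be sketched as a least-squares fit of the intensities on one side of the edge against those on the other. A multiplicative term away from 1 or an additive term away from 0 then signals a photometric distortion. The name `fit_photometric_transform` and the sample data are hypothetical, chosen only to illustrate the idea.

```python
def fit_photometric_transform(lit, shaded):
    """Least-squares fit of shaded ~ slope * lit + intercept.

    Returns (slope, intercept): the multiplicative and additive
    regression terms relating the two sides of the edge.
    """
    if len(lit) != len(shaded) or len(lit) < 2:
        raise ValueError("need two equal-length sequences of at least 2 samples")
    n = len(lit)
    mean_x = sum(lit) / n
    mean_y = sum(shaded) / n
    sxx = sum((x - mean_x) ** 2 for x in lit)
    if sxx == 0:
        raise ValueError("reference side has no intensity variation")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lit, shaded))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical pattern on the lit side and the same pattern in shadow.
lit_side = [10, 40, 20, 60, 30]
shadow_side = [10, 25, 15, 35, 20]
slope, intercept = fit_photometric_transform(lit_side, shadow_side)
```

With real imagery the fit would not be exact, since the nonlinear effects mentioned above perturb both terms.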


A low correlation does not necessarily signal an extremal boundary,
but could reflect low contrast or fragmented surface structure.
However, the local disruptions of correlation that signal extremal edges
can be distinguished from a global lack of structured surface markings
by using a neighborhood of the image around the edge to set a baseline
for correlation. A contour of low correlation surrounded by regions of
high correlation is likely to denote an extremal boundary.

To obtain a baseline, the given edge is embedded in a family of
parallel curves, and a sequence of regressions is performed from one
curve onto the next. In terms of this regression sequence, the various
edge types display distinctive "signatures" that can be computed from
the image data. Extremal boundaries display a sharp notch in correlation
where the fabric of the projective mapping is torn by the boundary.
Cast shadow boundaries sustain high correlations across the edge, but
sharp spikes occur in the regression parameters where the surface
structure is systematically transformed by the illumination transition.
A low correlation throughout implies that either the contrast is too low
or the surface structure too fragmented for any positive conclusion to
be drawn.

This strategy follows from the assumption that coherence in the
image (as measured by correlation) implies a connected surface. The
rationale for this assumption follows from some elementary observations
on the character of natural scenes and images. First, as mentioned
above, it follows from the fact that surfaces tend to be continuous, so
that nearby points in the image usually correspond to nearby points in
the scene (i.e., the projective mapping is, in general, continuous).
Second, because the structure of surfaces tends to be coherent, such
properties as reflectance and orientation at a given point on a
connected surface are (statistically) good predictors of the properties
at nearby points. Third, because scenes are made up of distinct objects
whose structure and spatial configuration are governed by extremely
complex factors, the properties of widely separated surface points, or
points on surfaces of distinct objects, can usually be regarded as
unrelated and independent.

Because of these three principles (surface continuity, coherence,
and independence), we can expect intensity values at nearby image points
to be highly correlated. (It is easily verified that this is so for
most images.) That is, a small step in the image usually corresponds to
a small step on some connected surface, so surface coherence imposes a
statistical relation on the properties of nearby points. Thus, when we
place the points on either side of an arbitrary image curve in
correspondence, we should often expect to see a high correlation between
the intensity values at those points. However, when that small step
happens to cross an extremal boundary, the corresponding surface points
belong, in general, to distinct objects, and might be widely separated in
space. In that case, the properties of the points are independent.
Thus, when the points on either side of an extremal boundary are placed
in correspondence, we should never observe a high correlation unless the
surfaces meeting at the boundary possess identical structures, and
happen to lie in perfect register from the observer's viewpoint. The
likelihood of this occurrence is vanishingly small.

Thus, we may confidently conclude that coherence of structure
across an image curve (as measured by correlation) denotes true
coherence of scene structure rather than an accident of scene
configuration.


D.  Implementation and Results

Our implementation assumes that an edge has been located by
edge-finding techniques. In practice, edges were often traced by hand;
automatically detected zero-crossing edges were also used as inputs. We
construct a parallel family of curves around the edge by imposing a new
coordinate system on the image as follows: arc length on the edge is
taken as the y-coordinate, and orthogonal distance from the edge
(right-handed) as the x-coordinate. This amounts to coercing a strip
around the edge into a rectangular region whose central column
corresponds to the original edge. The surrounding columns correspond to
parallel curves on either side of the edge. The rectangular strip was
constructed using bilinear interpolation of intensity values to reduce
quantization artifacts.
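The strip construction can be sketched as follows, under several assumptions that are not stated in this report: the edge is supplied as a list of (x, y) points roughly evenly spaced in arc length, the tangent is estimated by finite differences, the image is a list of rows, and samples outside the image are clamped to the border. The names `bilinear` and `rectify_strip` are hypothetical, and the sign convention chosen for the normal is likewise an assumption.

```python
import math

def bilinear(image, x, y):
    """Sample image[row][col] at real-valued (x = col, y = row),
    clamping coordinates to the image border."""
    rows, cols = len(image), len(image[0])
    x = min(max(x, 0.0), cols - 1.0)
    y = min(max(y, 0.0), rows - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, cols - 1), min(y0 + 1, rows - 1)
    fx, fy = x - x0, y - y0
    top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
    bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
    return top * (1 - fy) + bottom * fy

def rectify_strip(image, edge_points, half_width):
    """Resample a band around the edge into a rectangle.

    Row index    = position along the edge (arc length).
    Column index = signed orthogonal offset, -half_width..+half_width,
                   so the central column is the edge itself.
    """
    strip = []
    n = len(edge_points)
    for i, (px, py) in enumerate(edge_points):
        # Finite-difference tangent from the neighboring edge points.
        ax, ay = edge_points[max(i - 1, 0)]
        bx, by = edge_points[min(i + 1, n - 1)]
        tx, ty = bx - ax, by - ay
        norm = math.hypot(tx, ty)
        if norm == 0:
            raise ValueError("repeated edge point")
        tx, ty = tx / norm, ty / norm
        nx, ny = ty, -tx  # unit normal (sign convention assumed)
        strip.append([bilinear(image, px + d * nx, py + d * ny)
                      for d in range(-half_width, half_width + 1)])
    return strip

# Hypothetical 4 x 5 image whose intensity grows with column index,
# and a vertical edge traced down column 2.
image = [[c * 10 for c in range(5)] for _ in range(4)]
edge = [(2, r) for r in range(4)]
strip = rectify_strip(image, edge, half_width=1)
```

For a curved edge the same code applies; only the tangent estimate changes from point to point.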

Once the rectified strip was constructed, a sequence of linear
regressions was performed between columns. To avoid spurious
correlation imposed by the imaging and digitizing process, regressions
were computed between the i-th column and the (i+2)-th. The outcome of
this computation was a normalized correlation coefficient, additive
regression term, and multiplicative regression term, each a function of
the location of the column. The midpoints of these plots represent the
regression across the original edge.
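A minimal sketch of the regression sequence is given below, assuming the rectified strip is stored as a list of rows (one row per position along the edge). The helper names `regress` and `regression_sequence`, the `lag` parameter, and the synthetic strip are assumptions made for illustration, not details of the original program.

```python
import math

def regress(x, y):
    """Least-squares fit y ~ slope * x + intercept.
    Returns (correlation, slope, intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    if sxx == 0 or syy == 0:
        # A flat column carries no structure; report zero correlation.
        return 0.0, 0.0, mean_y
    slope = sxy / sxx
    return sxy / math.sqrt(sxx * syy), slope, mean_y - slope * mean_x

def regression_sequence(strip, lag=2):
    """Regress column i of strip[row][col] onto column i + lag,
    for every i where both columns exist."""
    n_cols = len(strip[0])
    results = []
    for i in range(n_cols - lag):
        col_a = [row[i] for row in strip]
        col_b = [row[i + lag] for row in strip]
        results.append(regress(col_a, col_b))
    return results

# Synthetic strip: a fixed pattern along the edge, dimmed (0.5 * v + 5)
# in the three rightmost columns to imitate a cast shadow.
pattern = [10, 40, 20, 60, 30]
strip = [[v, v, v, v, 0.5 * v + 5, 0.5 * v + 5, 0.5 * v + 5]
         for v in pattern]
sequence = regression_sequence(strip)
```

In this synthetic case the correlation stays high throughout, while the slope and intercept depart from 1 and 0 only for the column pairs that straddle the illumination change.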

In terms of these regression sequence plots, we define the
following expected edge signatures (see Figure 1 for idealized plots):

*  An extremal boundary is indicated by a sharp notch in an
otherwise high correlation at the nominal edge location.

*  A cast shadow boundary is indicated by high correlation
maintained across the edge, but sharp spikes or notches can
be present in the additive and multiplicative regression
parameters, depending on the sense of the shadow transition
and the digitization function.

*  A high correlation coefficient with no disturbance in the
regression parameters implies that the edge is not
physically significant.

*  Sustained low correlation implies low contrast or lack of
surface structure, and no classification can be made.
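Read as a decision procedure, the four signatures partition the possible outcomes. The sketch below is hypothetical and is not part of the reported program, in which the plots were compared by eye. The function name `classify_signature`, the three-column "center" window, and all threshold values are placeholder assumptions.

```python
def classify_signature(corr, slope, intercept,
                       high=0.8, slope_tol=0.2, intercept_tol=5.0):
    """Label a regression sequence using the four idealized signatures.

    corr, slope, intercept: equal-length lists indexed by column
    position; the middle entries straddle the nominal edge.
    All thresholds are illustrative assumptions.
    """
    mid = len(corr) // 2
    center = list(range(max(mid - 1, 0), min(mid + 2, len(corr))))
    flank = [i for i in range(len(corr)) if i not in center]
    if not flank:
        raise ValueError("sequence too short to set a baseline")

    # Baseline correlation away from the edge.
    baseline = sum(corr[i] for i in flank) / len(flank)
    if baseline < high:
        return "unclassifiable"      # sustained low correlation

    if min(corr[i] for i in center) < high:
        return "extremal boundary"   # notch in otherwise high correlation

    base_slope = sum(slope[i] for i in flank) / len(flank)
    base_icept = sum(intercept[i] for i in flank) / len(flank)
    disturbed = any(
        abs(slope[i] - base_slope) > slope_tol
        or abs(intercept[i] - base_icept) > intercept_tol
        for i in center
    )
    return "cast shadow" if disturbed else "no significant edge"

# Hypothetical seven-point sequences.
flat = [1.0] * 7
zero = [0.0] * 7
label_shadow = classify_signature([0.9] * 7,
                                  [1, 1, 0.5, 0.5, 1, 1, 1],
                                  [0, 0, 5, 5, 0, 0, 0])
label_extremal = classify_signature([0.9, 0.9, 0.9, 0.1, 0.9, 0.9, 0.9],
                                    flat, zero)
label_none = classify_signature([0.9] * 7, flat, zero)
label_unknown = classify_signature([0.2] * 7, flat, zero)
```

Any practical version would need thresholds tuned to the imagery and to the nonlinearities of the imaging process.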

No attempt has yet been made to classify the edge type signatures
automatically; however, the computation was performed on a number of
edges in both aerial and ground imagery. Examples of the images, edges,
and regression sequence plots are shown in Figures 2 through 6. The
regression plots should be compared to the idealized signatures of
Figure 1.


The edge-sorting method presented above, derived from basic
properties of visual scenes, shows promise as a useful technique,
particularly in connection with established line-junction techniques.
Potential specialized applications of the technique include shadow
detection for use in raised-object cueing and camera-model recovery.

A detailed report of this technique is being prepared for
publication.
[Plot panels: CORRELATION, SLOPE, INTERCEPT]

(a)  Extremal Boundary: notch in correlation across the edge. Slope and
intercept in the low-correlation area are meaningless.

(b)  Cast Shadow: sustained high correlation across the edge with
disturbance of one or both regression parameters. The nature of this
disturbance depends on the sense of the edge (i.e., whether the shadow
lies on the left or right), and on details of the imaging and digitizing
process. In practice, nonlinearities perturb the correlation slightly.

(c)  No Edge Present: sustained high correlation, no disturbance in
regression parameters.

FIGURE 1  IDEALIZED REGRESSION PLOTS